
feat(docker): optimize concurrency performance and memory management#1689

Closed
mzyfree wants to merge 4 commits into unclecode:develop from mzyfree:perf/concurrency-memory-optimization

Conversation

mzyfree commented Jan 4, 2026

Summary

This PR introduces a comprehensive optimization suite for crawl4ai in high-concurrency Docker environments. It focuses on improving QPS (Queries Per Second) and ensuring long-term memory stability by re-engineering the browser pooling mechanism and introducing optional resource filtering.

Key Design Principle: All new features are opt-in. By default, the system behaves exactly as before, ensuring zero impact on existing community users.

Core Enhancements:

  • Tiered Browser Pooling: Replaced the basic pool with a tiered system (Hot, Cold, Retired) to better manage instance lifecycles.
  • Reference Counting: Implemented active_requests tracking to prevent browsers from being closed while still processing requests, fixing common "Target closed" errors under load.
  • Aggressive Memory Management: Added a browser retirement mechanism that recycles instances after a certain number of uses or when system memory pressure is high.
  • Resource Optimization: Added optional blocking for CSS and ad-related network requests, significantly reducing the memory footprint per instance.
  • Observability: Added optional pool audit logs for real-time monitoring of browser health and usage.
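
The interplay between tiered pooling and reference counting described above can be sketched as follows. This is a minimal illustration, assuming hypothetical names (`PooledBrowser`, `TieredPool`) rather than the PR's actual classes in `crawler_pool.py`:

```python
import time
from dataclasses import dataclass, field

@dataclass
class PooledBrowser:
    sig: str                      # config signature used as the pool key
    active_requests: int = 0      # reference count of in-flight requests
    usage_count: int = 0          # total requests served (drives retirement)
    last_used: float = field(default_factory=time.time)

class TieredPool:
    def __init__(self, max_usage: int = 100):
        self.hot: dict[str, PooledBrowser] = {}      # recently used, kept warm
        self.cold: dict[str, PooledBrowser] = {}     # idle, cleanup candidates
        self.retired: dict[str, PooledBrowser] = {}  # draining; closed once idle
        self.max_usage = max_usage

    def acquire(self, sig: str) -> PooledBrowser:
        # Reuse a hot or cold instance if one exists, else create a new one.
        browser = self.hot.get(sig) or self.cold.pop(sig, None) or PooledBrowser(sig)
        browser.active_requests += 1
        browser.usage_count += 1
        browser.last_used = time.time()
        self.hot[sig] = browser
        return browser

    def release(self, browser: PooledBrowser) -> None:
        browser.active_requests -= 1
        # Retire heavily used instances, but never close one mid-request:
        # the janitor only closes retired browsers with active_requests == 0.
        if browser.usage_count >= self.max_usage:
            self.hot.pop(browser.sig, None)
            self.retired[browser.sig] = browser
```

Because the janitor only ever closes browsers whose `active_requests` count is zero, the "Target closed" race described above cannot occur.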

New Configuration Options

These new features can be enabled via BrowserConfig or Environment Variables:

Engine Layer (BrowserConfig)

  • avoid_ads (bool, default: False): Enable intercepting and blocking ad/tracker network requests.
  • avoid_css (bool, default: False): Enable blocking CSS resource loading to save CPU/Memory.
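
As a rough sketch, the filtering these two flags enable can be thought of as a per-request predicate applied during network interception. The pattern list and helper name below are illustrative, not the PR's exact code in `browser_manager.py`:

```python
# Illustrative subset of ad/tracker host patterns (the PR curates ~20).
AD_PATTERNS = ("doubleclick.net", "googlesyndication.com", "adnxs.com")

def should_block(url: str, resource_type: str,
                 avoid_ads: bool = False, avoid_css: bool = False) -> bool:
    """Decide whether an intercepted request should be aborted."""
    if avoid_css and resource_type == "stylesheet":
        return True
    if avoid_ads and any(pattern in url for pattern in AD_PATTERNS):
        return True
    return False  # defaults leave every request untouched
```

With both flags at their default `False`, the predicate never blocks anything, which is what preserves backward compatibility.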

Docker Layer (Environment Variables)

  • CRAWL4AI_BROWSER_RETIREMENT_ENABLED (default: false): Enable the usage/memory-based retirement mechanism.
  • CRAWL4AI_PERMANENT_BROWSER_DISABLED (default: false): If true, disables the always-on permanent browser instance.
  • CRAWL4AI_POOL_AUDIT_ENABLED (default: false): Enable detailed pool status logging every 5 minutes.
  • CRAWL4AI_BROWSER_MAX_USAGE (default: 100): Max requests per instance before retirement.
  • CRAWL4AI_MEMORY_RETIRE_THRESHOLD (default: 75): System memory % to trigger aggressive retirement.
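
A sketch of how the Docker layer could read these toggles with safe defaults; the `env_flag` helper is an assumption for illustration, not the PR's exact parsing code:

```python
import os

def env_flag(name: str, default: bool = False) -> bool:
    """Parse a boolean environment variable, defaulting to the safe value."""
    return os.getenv(name, str(default)).strip().lower() in ("1", "true", "yes")

RETIREMENT_ENABLED = env_flag("CRAWL4AI_BROWSER_RETIREMENT_ENABLED")
PERMANENT_DISABLED = env_flag("CRAWL4AI_PERMANENT_BROWSER_DISABLED")
POOL_AUDIT_ENABLED = env_flag("CRAWL4AI_POOL_AUDIT_ENABLED")
MAX_USAGE = int(os.getenv("CRAWL4AI_BROWSER_MAX_USAGE", "100"))
MEMORY_RETIRE_THRESHOLD = int(os.getenv("CRAWL4AI_MEMORY_RETIRE_THRESHOLD", "75"))
```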

List of files changed and why

  • crawl4ai/async_configs.py: Added new parameters to BrowserConfig.
  • crawl4ai/browser_manager.py: Implemented the network interception logic for resource filtering.
  • deploy/docker/crawler_pool.py: Implemented the tiered pool, retirement, and audit logic.
  • deploy/docker/api.py & deploy/docker/server.py: Updated with try...finally for accurate reference counting.
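
The `try...finally` pattern in the handlers can be sketched as below. `get_crawler` / `release_crawler` stand in for the pool functions in `crawler_pool.py`; the exact signatures here are assumptions:

```python
async def handle_crawl(pool, config, url: str):
    crawler = await pool.get_crawler(config)   # increments active_requests
    try:
        return await crawler.arun(url=url)
    finally:
        # Always decrement the reference count, even when arun() raises,
        # so the janitor never sees a stale in-flight count.
        await pool.release_crawler(crawler)
```

Without the `finally`, a request that raises mid-crawl would leave `active_requests` elevated forever, pinning the browser in the pool and leaking memory.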

How Has This Been Tested?

  • Stress Testing: Performed high-concurrency load tests (50+ concurrent requests) on a Kubernetes cluster. Observed a significant increase in sustained QPS without OOM issues.
  • Memory Stability: Verified that the "Retirement" and "Janitor" logic successfully reclaims memory during and after high-load periods.
  • Backward Compatibility: Confirmed that the system remains stable and identical in behavior when all new toggles are set to their default (False) values.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added/updated unit tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Stress test performance

QPS increased by 40%

(screenshot: QPS benchmark)

Resource with no OOM

(screenshot: resource usage)

dylan.min and others added 2 commits January 4, 2026 09:25
This commit consolidates several optimizations for crawl4ai in high-concurrency environments:

1. Browser Pool Optimization:
   - Implemented a tiered browser pool (Hot, Cold, Retired).
   - Added a browser retirement mechanism based on usage count (MAX_USAGE_COUNT) and memory pressure (MEMORY_RETIRE_THRESHOLD).
   - Added reference counting (active_requests) to ensure browser instances are not closed while in use.
   - Enhanced the pool janitor with adaptive cleanup intervals based on system memory.

2. Resource Loading Optimization:
   - Integrated optional CSS and Ad blocking to reduce memory footprint and improve QPS.
   - Decoupled resource filtering from text_mode to allow granular control.

3. Stability and Scalability:
   - Added mandatory release_crawler calls in API/Server handlers to prevent resource leaks.
   - Introduced environment variables to toggle these new features (defaulting to False for safe community adoption).
   - Added optional 5-minute pool audit logs for better observability.

Co-authored-by: dylan.min <dylan.min@example.com>
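
The retirement and adaptive-janitor logic from item 1 of the commit message can be sketched as two pure decision functions. Threshold semantics mirror `CRAWL4AI_MEMORY_RETIRE_THRESHOLD` and `CRAWL4AI_BROWSER_MAX_USAGE`; the function names and interval values are illustrative, not the committed code:

```python
def should_retire(usage_count: int, memory_percent: float,
                  max_usage: int = 100, memory_threshold: float = 75.0) -> bool:
    """Retire a browser once heavily used, or early when the host is under pressure."""
    return usage_count >= max_usage or memory_percent >= memory_threshold

def janitor_interval(memory_percent: float) -> float:
    """Adaptive cleanup cadence: sweep more often as memory pressure rises."""
    if memory_percent >= 90:
        return 10.0    # seconds; aggressive sweeps near OOM
    if memory_percent >= 75:
        return 30.0
    return 60.0        # relaxed cadence when memory is healthy
```

In production the `memory_percent` input would come from a system probe such as `psutil.virtual_memory().percent`.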
…eanup docs

- Refactor BrowserManager to dynamically block resources based on avoid_css and text_mode
- Align text_mode behavior with community standards (no forced CSS blocking)
- Add Top 20 curated ad and tracker patterns for performance
- Restore and translate permanent browser logs in crawler_pool.py
- Clean up models.py schema annotations and server.py docstrings
- Add unit and functional tests for filtering flags
mzyfree force-pushed the perf/concurrency-memory-optimization branch from 01fb7ee to 47bc688 on January 6, 2026

mzyfree commented Jan 8, 2026

@unclecode @ntohidi please review this PR

@chrizzly2309

@mzyfree +1, better support in high-concurrency environments is needed

@AlbertInRC

Looking forward to it as well! Current performance is very poor: QPS is below 1 on 2 CPU + 4 GB RAM when fetching 3 URLs in one request.

@AlbertInRC

@ntohidi @aravindkarnam Pls help...

@AlbertInRC

Any update pls?

@AlbertInRC

Is anyone looking at this issue, please?

@AlbertInRC

@unclecode @ntohidi @ara
Could you please help take a look? We are facing the performance issue as well; the temporary solution is to use this PR.

@unclecode (Owner)

@mzyfree Thanks for this PR, you have done a good job here; I will review it soon. Sorry for the late reply, I have been very busy preparing our hosted platform for Crawl4ai. @AlbertInRC thanks for mentioning me on this.

Resolve conflicts in async_configs and docker server while keeping avoid_ads/avoid_css and upstream init_scripts, and retaining upstream URL scheme validation.

mzyfree commented Feb 24, 2026

@unclecode

Thanks for the update! Totally understand — appreciate you taking the time to review.
Please let me know if you’d like me to adjust anything or add more benchmarks/test results.
I'm really looking forward to trying out your cloud-hosted version.

@ntohidi ntohidi changed the base branch from main to develop February 25, 2026 01:27

unclecode commented Feb 25, 2026

@mzyfree Thanks for this excellent PR - the analysis of pool-level resource leaks and the avoid_ads/avoid_css idea were spot-on.

We've been doing a lot of internal refactoring on the browser manager and pool layers recently, so rather than merging this directly (it would need significant rebasing), we've implemented the core ideas from your PR ourselves, adapted to the current codebase:

  • avoid_ads / avoid_css BrowserConfig flags - opt-in resource filtering with ad/tracker domain blocking and CSS route blocking
  • release_crawler() + active_requests tracking - proper pool-level lifecycle management so the janitor doesn't close browsers with in-flight requests
  • finally blocks in all API/server handlers - fixing the resource leak you identified

We intentionally left out the browser retirement mechanism for now since it overlaps with our existing max_pages_before_recycle at the context level - we want to design a unified approach rather than having two competing recycling mechanisms.

These changes are already pushed to the develop branch and will be available in the next version release. We'll be adding your name to the CONTRIBUTORS file as well. Really solid work here - the stress testing data and the tiered pool analysis were very helpful in validating the approach. Closing this PR in favor of the merged implementation, but please keep the contributions coming - this kind of deep performance analysis is exactly what the project needs! You may close this PR.

unclecode added a commit that referenced this pull request Feb 25, 2026
…ecycle

Add opt-in BrowserConfig flags (avoid_ads, avoid_css) for blocking ad/tracker
domains and CSS resources at the browser context level. Refactor crawler pool
with release_crawler() and active_requests tracking to prevent janitor from
closing browsers with in-flight requests. Add proper finally blocks to all
Docker API/server handlers. Update docs for new config options.

Inspired by #1689.
unclecode added a commit that referenced this pull request Feb 25, 2026
@unclecode (Owner)

Thanks for the contribution! This fix has already been implemented on the develop branch via a different approach and will be in the next release. Closing as superseded - appreciate your effort!

@unclecode unclecode closed this Mar 7, 2026